VB calibration to improve the interface between phone recognizer and i-vector extractor

Author

  • Niko Brümmer
Abstract

The EM training algorithm of the classical i-vector extractor [1, 2] is often incorrectly described as a maximum-likelihood method. The i-vector model is however intractable—the likelihood itself and the hidden-variable posteriors needed for the EM algorithm cannot be computed in closed form. We show here that the classical i-vector extractor recipe is actually a mean-field variational Bayes (VB) solution. This theoretical VB interpretation turns out to be of further use, because it also offers an interpretation of the newer phonetic i-vector extractor recipe, thereby unifying the two flavours of extractor. More importantly, the VB interpretation is also practically useful—it suggests ways of modifying existing i-vector extractors to make them more accurate. In particular, in existing methods, the approximate VB posterior for the GMM states is fixed, while only the parameters of the generative model are adapted. Here we explore the possibility of also mildly adjusting (calibrating) those posteriors, so that they better fit the generative model. In what follows, we introduce notation by summarizing the i-vector model. Then we interpret the classical recipe to deal with this intractable model as a mean-field VB algorithm. Next, we do the same with the newer phonetic i-vector recipe. Finally, we extend the phonetic recipe by allowing calibration of the phone posteriors.
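
As a rough illustration of what "mildly adjusting (calibrating)" fixed state posteriors could look like, the sketch below rescales each frame's posterior in the log domain and renormalizes it. The function name `calibrate_posteriors` and the parameters `scale` and `offsets` are assumptions made for this example. The abstract does not state the form of the calibration, so this should not be read as the paper's recipe.

```python
import math


def calibrate_posteriors(posteriors, scale=1.0, offsets=None, floor=1e-12):
    """Rescale per-frame state posteriors in the log domain, then renormalize.

    posteriors: list of frames, each a list of K probabilities.
    scale, offsets: hypothetical calibration parameters (illustrative only,
    not taken from the paper).
    """
    calibrated = []
    for frame in posteriors:
        offs = offsets if offsets is not None else [0.0] * len(frame)
        # Affine transform of the log-posteriors; floor avoids log(0).
        logits = [scale * math.log(max(p, floor)) + b
                  for p, b in zip(frame, offs)]
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        calibrated.append([e / total for e in exps])
    return calibrated


frames = [[0.7, 0.2, 0.1], [0.05, 0.9, 0.05]]
unchanged = calibrate_posteriors(frames)             # scale=1, no offsets
flattened = calibrate_posteriors(frames, scale=0.5)  # softer posteriors
```

With `scale=1.0` and no offsets the posteriors are returned unchanged up to rounding; a `scale` below 1 flattens them and a `scale` above 1 sharpens them. In a setting like the one the abstract describes, such parameters would presumably be tuned so that the posteriors better fit the generative model, but that tuning step is not shown here.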


Similar resources

Handy: A real-time three color glove-based gesture recognizer with learning vector quantization

This paper presents Handy, a real-time hand gesture recognizer based on a three color glove. The recognizer is formed by three modules. The first module, fed by the frame acquired by a webcam, identifies the hand image in the scene. The second module, a feature extractor, represents the image by a nine-dimensional feature vector. The third module, the classifier, is performed by means of Learni...


Cursive character recognition by learning vector quantization

This paper presents a cursive character recognizer embedded in an off-line cursive script recognition system. The recognizer is composed of two modules: the first one is a feature extractor, the second one a learning vector quantizer. The selected feature set was compared to Zernike polynomials using the same classifier. Experiments are reported on a database of about 49,000 isolated characters.


Phone vector DHMM to decode a phone recognizer's output

In this paper we introduce a Phone Vector Discrete HMM (PVDHMM) that decodes a phone recognizer’s output. The proposed PVDHMM treats a phone recognizer as a vector quantizer whose codebook size is equal to the size of its phone set. To examine the proposed method we perform two experiments. First, the output of a phone recognizer is recognized by the PVDHMM, and its results are compared with th...


Speech Recognition Using Neural Networks

Although speech recognition products are already available in the market at present, their development is mainly based on statistical techniques which work under very specific assumptions. The work presented in this thesis investigates the feasibility of alternative approaches for solving the problem more efficiently. A speech recognizer system comprised of two distinct blocks, a Feature Extrac...


Manawi: Using Multi-Word Expressions and Named Entities to Improve Machine Translation

We describe the Manawi (mAnEv) system submitted to the 2014 WMT translation shared task. We participated in the English-Hindi (EN-HI) and Hindi-English (HI-EN) language pairs and achieved 0.792 for the Translation Error Rate (TER) score for EN-HI, the lowest among the competing systems. Our main innovations are (i) the usage of outputs from NLP tools, viz. bilingual multi-word expression extr...



Journal title:
  • CoRR

Volume: abs/1510.03203


Publication date: 2015